Feature extraction for voice-driven synthesis
Author
Abstract
This paper explores the singing voice from an unusual perspective: not as a musical instrument but as a musical controller. A set of spectral processing algorithms extracts features from the input voice. These features are grouped into four categories: excitation, vocal tract, voice quality and context. The extracted values are then transmitted as Open Sound Control (OSC) messages so that they can be used in an external synthesis engine. In this document, we first provide a technical description of the algorithms and then detail the components of the system. A practical example of voice-driven synthesis using PureData (Pd) is also presented.
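The transmission step described in the abstract can be illustrated with a minimal sketch, assuming one float value per feature and one OSC address per feature. The address names (for example `/voice/excitation/pitch`) and the feature values are hypothetical and are not taken from the paper; only the byte layout follows the OSC 1.0 message format (null-padded address, type tag string, big-endian 32-bit floats).

```python
import struct


def osc_string(text: str) -> bytes:
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    padding = (-len(data)) % 4
    return data + b"\x00" * padding


def osc_message(address: str, *values: float) -> bytes:
    """Build an OSC 1.0 message whose arguments are all 32-bit floats."""
    type_tags = "," + "f" * len(values)
    payload = b"".join(struct.pack(">f", value) for value in values)
    return osc_string(address) + osc_string(type_tags) + payload


def encode_features(features: dict) -> list:
    """Turn a {address: value} mapping into a list of OSC message packets."""
    return [osc_message(address, value) for address, value in features.items()]


# Hypothetical frame of extracted features, one per group named in the abstract.
example_frame = {
    "/voice/excitation/pitch": 220.0,
    "/voice/tract/formant1": 700.0,
    "/voice/quality/breathiness": 0.25,
    "/voice/context/attack": 1.0,
}

packets = encode_features(example_frame)
```

Each packet could then be sent as one UDP datagram (for instance with `socket.sendto`) to whatever port the synthesis engine listens on; the host and port are left out here because the source does not specify them.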
Similar resources
An artificial intelligence approach to concatenative sound synthesis
Content Overview; List of Figures; List of Tables; List of Abbreviations; Acknowledgments; Author’s Declaration. Chapter 1: Introduction (1.1 Motivation; 1.2 Introduction; 1.3 Objectives; 1.4 Thesis Structure). Chapter 2: Principles of Concatenative Sound Synthesis (2.1 Sound Synthesis; 2.1.1 Rule-based Model; 2.1.2 Data-driven Model; 2.2 Sub...
Articulatory acoustic feature applications in speech synthesis
The quality of unit selection speech synthesisers depends significantly on the content of the speech database being used. In this paper a technique is introduced that can highlight mispronunciations and abnormal units in the speech synthesis voice database through the use of articulatory acoustic feature extraction to obtain an additional layer of annotation. A set of articulatory acoustic feat...
A method of creating a new speaker’s voicefont in a text-to-speech system
This paper presents a method of creating a new speaker’s voice database (VoiceFont) by which the voice of the donor speaker can be synthesized for mimicking in a text-to-speech system. A VoiceFont creation system, “VoiceFont Builder”, is developed to make the creation process easier and more effective than current systems. The voice feature extraction applied in the system is a simple but power...
Intentional voice command detection for completely hands-free speech interface in home environments
We introduce a new class of speech processing, called Intentional Voice Command Detection (IVCD). To achieve a completely hands-free speech interface, it is necessary to reject not only noise but also unintended voices. The conventional VAD framework is not sufficient for this purpose, and we discuss how IVCD should be defined and how it can be realized. We investigate implementation of IVCD from the v...
Designing a new non-parallel training method for voice conversion that performs better than parallel training
Introduction: Voice mimicking by computers has been one of the most challenging topics in speech processing in recent years. A voice conversion system has two sides: on one side is the source speaker, whose voice is changed to mimic the voice of the target speaker on the other side. Two methods of p...